Skip to content

Conversation

shubhaat
Copy link
Contributor

@shubhaat shubhaat commented Aug 5, 2025

Indexing strategy recommendations in accordance with small indices causing a large overhead

Indexing strategy recommendations
@shubhaat shubhaat requested a review from a team as a code owner August 5, 2025 00:02
Copy link

github-actions bot commented Aug 5, 2025

🔍 Preview links for changed docs

Comment on lines +53 to +59
* To ensure optimal performance and cost-effectiveness for your project, it’s important to consider how you structure your data.
* Consolidate small indices for better efficiency. We recommend avoiding a design where your project contains hundreds of very small indices, specifically those under 1GB each.
* Why is this important?
* Every index in Elasticsearch has a certain amount of resource overhead. This is because Elasticsearch needs to maintain metadata for each index to keep it running smoothly. When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices. This can lead to higher resource consumption and hence higher costs and potentially impact the overall performance of your project.

* Recommended Approach
* If your use case naturally generates many small, separate streams of data, we advise implementing a process to consolidate them into fewer, larger indices. This practice leads to more efficient resource utilization. By grouping your data into larger indices, you can ensure a more performant and cost-efficient experience with Elasticsearch Serverless.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* To ensure optimal performance and cost-effectiveness for your project, it’s important to consider how you structure your data.
* Consolidate small indices for better efficiency. We recommend avoiding a design where your project contains hundreds of very small indices, specifically those under 1GB each.
* Why is this important?
* Every index in Elasticsearch has a certain amount of resource overhead. This is because Elasticsearch needs to maintain metadata for each index to keep it running smoothly. When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices. This can lead to higher resource consumption and hence higher costs and potentially impact the overall performance of your project.
* Recommended Approach
* If your use case naturally generates many small, separate streams of data, we advise implementing a process to consolidate them into fewer, larger indices. This practice leads to more efficient resource utilization. By grouping your data into larger indices, you can ensure a more performant and cost-efficient experience with Elasticsearch Serverless.
* To ensure optimal performance and cost-effectiveness for your project, it’s important to consider how you structure your data.
* Consolidate small indices for better efficiency. We recommend avoiding a design where your project contains hundreds of very small indices, specifically those under 1GB each.
* Why is this important?
* Every index in {{es}} has a certain amount of resource overhead. This is because {{es}} maintains metadata for each index to keep it running smoothly. When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices. This can lead to higher resource consumption and hence higher costs, and can also impact the overall performance of your project.
* Recommended Approach
* If your use case naturally generates many small, separate streams of data, we advise implementing a process to consolidate them into fewer, larger indices. This practice leads to more efficient resource utilization. By grouping your data into larger indices, you can ensure a more performant and cost-efficient experience with {{es-serverless}}.

@kilfoyle
Copy link
Contributor

kilfoyle commented Aug 5, 2025

Looks good @shubhaat! 🚀
I added some small suggestions. Here's how it'll appear:

screen

@shubhaat
Copy link
Contributor Author

shubhaat commented Aug 5, 2025

@kilfoyle Thanks indeed, I saw the formatting weirdness. Would you be kind to approve and we can merge and have this out and about.

@shubhaat shubhaat requested a review from kilfoyle August 5, 2025 20:51
* Every index in Elasticsearch has a certain amount of resource overhead. This is because Elasticsearch needs to maintain metadata for each index to keep it running smoothly. When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices. This can lead to higher resource consumption and hence higher costs and potentially impact the overall performance of your project.

* Recommended Approach
* If your use case naturally generates many small, separate streams of data, we advise implementing a process to consolidate them into fewer, larger indices. This practice leads to more efficient resource utilization. By grouping your data into larger indices, you can ensure a more performant and cost-efficient experience with Elasticsearch Serverless.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* If your use case naturally generates many small, separate streams of data, we advise implementing a process to consolidate them into fewer, larger indices. This practice leads to more efficient resource utilization. By grouping your data into larger indices, you can ensure a more performant and cost-efficient experience with Elasticsearch Serverless.
* If your use case naturally generates many small, separate streams of data, we advise implementing a process to consolidate them into fewer, larger indices. This practice leads to more efficient resource utilization. By grouping your data into larger indices, you can ensure a more performant and cost-efficient experience with {{es-serverless}}.

* To ensure optimal performance and cost-effectiveness for your project, it’s important to consider how you structure your data.
* Consolidate small indices for better efficiency. We recommend avoiding a design where your project contains hundreds of very small indices, specifically those under 1GB each.
* Why is this important?
* Every index in Elasticsearch has a certain amount of resource overhead. This is because Elasticsearch needs to maintain metadata for each index to keep it running smoothly. When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices. This can lead to higher resource consumption and hence higher costs and potentially impact the overall performance of your project.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* Every index in Elasticsearch has a certain amount of resource overhead. This is because Elasticsearch needs to maintain metadata for each index to keep it running smoothly. When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices. This can lead to higher resource consumption and hence higher costs and potentially impact the overall performance of your project.
* Every index in {{es}} has a certain amount of resource overhead. This is because {{es}} needs to maintain metadata for each index to keep it running smoothly. When you have a very large number of small indices, the combined overhead from all of them can consume more CPU resources than if the same data were stored in fewer, larger indices. This can lead to higher resource consumption and hence higher costs and potentially impact the overall performance of your project.

Copy link
Contributor

@kilfoyle kilfoyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shubhaat. I've approved.

Please add in my two suggestions though, just to remove extra spacing and use our variables for the product names.

@shubhaat shubhaat merged commit a6d19bf into main Aug 5, 2025
8 checks passed
@shubhaat shubhaat deleted the indexing-strategy-recommendations branch August 5, 2025 21:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants